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Abstract 

The wavefunction for the multiparticle Schrodinger equation is a function of many 
variables and satisfies an antisymmetry condition, so it is natural to approximate it as 
a sum of Slater determinants. Many current methods do so, but they impose additional 
structural constraints on the determinants, such as orthogonality between orbitals or 
an excitation pattern. We present a method without any such constraints, by which 
we hope to obtain much more efficient expansions, and insight into the inherent struc- 
ture of the wavefunction. We use an integral formulation of the problem, a Green's 
function iteration, and a fitting procedure based on the computational paradigm of 
separated representations. The core procedure is the construction and solution of a 
matrix-integral system derived from antisymmetric inner products involving the po- 
tential operators. We show how to construct and solve this system with computational 
complexity competitive with current methods. 
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I Introduction 

Given the difficulties of solving the multiparticlc Schrodinger equation, current numerical 
methods in quantum chemistry/physics are remarkably successful. Part of their success 
comes from efficiencies gained by imposing structural constraints on the wavefunction to 
match physical intuition. However, such methods scale poorly to high accuracy, and are 
biased to only reveal structures that were part of their own construction. Our goal is to 
develop a method that scales well to high accuracy and allows an unbiased exploration of the 
structure of the wavefunction. In this paper we take a step toward this goal by developing a 
method to approximate the wavefunction as an unconstrained sum of Slater determinants. 

Since the multiparticlc fermionic wavefunction is an antisymmetric function of many 
variables, it is natural to approximate it as a sum of Slater determinants, at least as a first 
step. Motivated by the physical intuition that electrons may be excited into higher energy 
states, the Configuration Interaction (CI) family of methods choose a set of determinants 
with predetermined orbitals, and then optimize the coefficients used to combine them. When 
it is found insufficient, methods to optimize the orbitals, work with multiple reference states, 
etc., are introduced (along with an alphabet of acronyms). A common feature of all these 
methods is that they impose some structural constraints on the Slater determinants, such 
as orthogonality of orbitals or an excitation pattern. As the requested accuracy increases, 
these structural constraints trigger an explosion in the number of determinants used, making 
the computation intractable for high accuracy. 

The a priori structural constraints present in Cl-like methods also force the wavefunction 
to comply with such structure, whether or not it really is the case. For example, if you use a 
method that approximates the wavefunction as a linear combination of a reference state and 
excited states, you could not learn that the wavefunction is better approximated as a linear 
combination of several non-orthogonal, near-reference states. Thus the choice of numerical 
method is not just a computational issue; it can help or hinder our understanding of the 
wavefunction. 

For these reasons, our goal is to construct an adaptive numerical method without im- 
posing a priori structural constraints besides that of antisymmetry. In this paper we derive 
and present an algorithm for approximating a wavefunction with an unconstrained sum of 
Slater determinants, with fully- adaptive single-electron functions. In particular we discard 
the notions of reference state and excitation of orbitals. The functions comprising the Slater 
determinants need not come from a particular basis set, be orthogonal, or follow some ex- 
citation pattern. They are computed so as to optimize the overall representation. In this 
respect we follow the philosophy of separated representations [U [5] , which allow surprisingly 
accurate expansions with remarkably few terms. 

Our construction generates a solution using an iterative procedure based on nonlinear 
approximations via separated representations. To accomplish this nonlinear approximation, 
we derive a system of integral equations that describe the fully-correlated many-particle 
problem. The computational core of the method is the repeated construction and solution 
of a matrix-integral system of equations. 

Specifically, our approach has the following distinctive features: 

• We use an adaptive representation for single-electron functions, but our method does 
not depend on its details. 

• We use an integral formulation of the multiparticlc Schrodinger equation and a Green's 
function iteration to converge to the ground-state wavefunction. The Green's function 
is decomposed and applied using separated approximations obtained by expanding the 
kernel into Gaussians. 
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• We use a variant of the so-called alternating least squares algorithm to reduce the 
error of our approximation using a sum of a given number of Slater determinants. 

• We compute antisymmetric inner products involving portions of the Hamiltonian oper- 
ator by reducing them to formulas involving only combinations of standard integrals. 
In particular, we avoid the direct application of the electron-electron potential and 
instead compute convolutions with the Poisson kernel. 

By doing this, we hope to represent the effects of correlations in the most natural and 
concise way possible, thus providing both computational efficiency and physical insight. We 
believe that this algorithm and the system of integral equations underlying it provide the 
foundation for a new approach to solving the multiparticle Schrodinger equation. We defer 
to the sequels several important issues, such as algorithmic sizc-consistency/extensivity and 
the treatment of the inter-electron cusp. 

In Section [11] we formulate the problem more carefully, make precise some of the state- 
ments that we made in this introduction, and give a high-level description of the method. 
We then present the derivations and proofs in the following sections. 

II Problem Formulation and Description of the Method 
II. 1 Formulation of the Problem 

We consider the time- independent, nonrelativistic, multiparticle Schrodinger equation, and 
fix the nuclei according to the Born-Oppenheimer approximation, so the equation describes 
the steady state of an interacting system of electrons. For each of the N electrons in the 
system there are three spatial variables r — (x, y, z) and a discrete spin variable a taking the 
values {—5, 5}, which we combine and denote (r, cr) by 7. The Hamiltonian operator 7i is a 
sum of a kinetic energy operator T, a nuclear potential operator V, and an electron-electron 
interaction operator W, defined in atomic units by 



where is the three-dimensional Laplacian acting in the variable and v{v) is a sum 
of terms of the form —Za/\\v— Ra|| from a nucleus at position with charge Za- The 
antisymmetric eigenf unctions of Ti represent electronic states of the system and are called 
wavefunctions. Antisymmetric means that under the exchange of any two coordinates, the 
wavefunction is odd, e.g. ^/'(72, 71, . . .) = — ^'(71, 72, ■ ■ ■)■ The bound-state wavefunctions 
have negative eigenvalues, and are of greatest interest. We will focus on the ground-state 
wavefunction, which has the most negative eigenvalue, although the techniques can be used 
for other states. In summary, our goal is to find E and il), with E the most negative 
eigenvalue in 



subject to the antisymmetry condition on -0. Analytic methods can give qualitative results 
about the solutions, and determine limiting cases, but most quantitative results must be 
obtained numerically. Although the equation is 'just' an eigenvalue problem, its numerical 
solution presents several serious difficulties, among them the large number of variables and 
the antisymmetry condition on the solution. The simplest method that addresses these two 
difhcultics is Hartree-Fock (HF) (see e.g. [28]), which uses the antisymmetrization of a single 
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product, called a Slater determinant, to approximate the iV-particle wavefunction, i.e. 
^ 1 

i=l 

Any antisymmetric approximation ip to the wavefunction -0 can be substituted into 

where (•, •) is the usual inner product, to obtain an estimate for E. This estimate gives an 
upper bound on the lowest value of E that solves ([2]). Substituting ^ into (jl]), one can 
iteratively solve for 0^ to minimize The resulting V^hf will best approximate 0, in the 
sense of providing the best estimate (|4]) . 

To improve upon HF, it is natural to consider the antisymmetrization of a sum of prod- 
ucts 

r N 

i^ir)=AJ2slY[(^ih^), (5) 

1=1 i=l 

which could also be written as a sum of Slater determinants. The coefficients si are in- 
troduced in order to have \\(f>\\\ = 1. Many methods are based on this form, but they use 
it in different ways. The Configuration Interaction (CI) method (see e.g. ^STj) chooses the 
functions 0' from a preselected master set of orthogonal functions and decides on a large 
number r of combinations to consider, based on excitation level. Substituting ([5]) into (j4]) 
leads to a matrix eigenvalue problem that can be solved for the scalar coefficients si. The 
Muhi-Configuration Self-Consistent Field (MCSCF) method (e.g. [201 HH) solves for the 
master set of orthogonal functions as well as the scalar coefficients. There are numerous 
variations and combinations of these methods, too many to describe here. 



^1(71) 
^2(71) 



01(72) 
02(72) 



(3) 



07V (71) 4>n{12) 



4>n{in) 



II. 1.1 What is New Here 

In this work we construct and demonstrate a method that also uses a wavefunction of the 
form ([5]) but without constraints on the <j)\. We remove both structural constraints, such as 
an excitation pattern or orthogonality between single-electron functions, and representation 
constraints, such as those imposed by using a predetermined basis set. 

Many methods (e.g. [SIllEllMlIIlITSllIlllIHlEllISlleOlllI]) have loosened the constraints 
on the Slater determinants in one way or another, often with encouraging results. These 
works, however, only partially removed the constraints, and so, we claim, did not achieve 
the full potential of an unconstrained approximation. By removing these constraints we 
hope to produce much better approximations at much smaller separation rank r than ex- 
isting methods allow. We also hope to provide new perspective from which to analyze and 
understand the wavefunction, free from the biases that physical intuition imposes. 

Our hopes are based on our work in [H [51 [33] , where we developed general methods to 
represent and compute with functions and operators in many dimensions. We used sums 
of separable functions, dubbed separated representations, similar to ([5]). We found rather 
natural examples where removing constraints produces expansions that are exponentially 
more efficient, i.e. r = N instead of 2^ or r = log TV instead of N . For example, in our 
approach we can have a two-term representation 



N N 
i=l i=l 



(6) 



an Unconstrained Sum of Slater Determinants 



5 



where {(t>j}'j^i form an orthogonal set. To represent the same function as ^ while imposing 
the constraint that factors come from a master orthogonal set would force one to multiply 
out the second term, and thus use a representation with 2^ terms. 

At present we have no proof that the wavefunction is well-approximated by a structure 
that would benefit from the removal of constraints. The size r needed in practice, and 
how it depends on the various parameters in the problem, is thus still an open question. 
In [H O I43j , the most interesting examples came from "reverse-engineering" the numerical 
results to obtain formulas and proofs. We therefore expect that the tools we provide here 
will allow an exploration of the wavefunction, perhaps revealing unexpected structure, and 
a strategy for a proof. 

II. 2 Description of the Algorithm 

The removal of constraints in ([5]), and, thus, the basis sets, coefficients, and other structure 
that went along with them, also eliminates the conventional strategies for constructing ^ to 
minimize Q . It leads one to consider how one would compute the ground-state wavefunction 
if its numerical representation were not an issue. We choose to use an integral iteration, 
which we sketch in Section III. 2. II In Appendix |^ we sketch an alternative iteration based 
on gradient descent. 

To use the form ^ we must choose some value of r, which determines the quality of 
the approximation. In Section III. 2. 21 we show how to incorporate a nonlinear fitting step 
within the integral iteration in order to maintain fixed r. Accomplishing this fitting requires 
a significant amount of machinery, which makes up the body of the paper. Eventually one 
would want to adaptively determine r, but we do not address that issue here. 

II. 2.1 A Green's Function Iteration 

The eigenvalue equation ^ contains the differential operator 7i, which has both the dis- 
crete negative eigenvalue(s) that we are interested in and unbounded, continuous, positive 
spectrum. In [311 132] this differential equation was reformulated as an integral equation, 
producing an operator with only discrete, bounded spectrum. Such integral formulations 
are in general far superior to differential formulations, since, e.g. numerical noise is sup- 
pressed rather than amplified. An iteration based on the integral formulation with Green's 
functions was introduced in [211 [32] and used in e.g. [IU[13. A rigorous analysis of this 
iteration is given in 44J based on classical theorems from [301, ^33^ [52! ESI [Ml • In this section 
we review this iteration, and then modify it in Section III. 2. 21 to preserve our wavefunction 
representation 

Define the Green's function 

g^ = iT-tii)-\ (7) 

for ^ < 0, and consider the Lippmann-Schwinger integral equation 

X^.i^^. = -g^.[{V + W)^b^]. (8) 

The subscript fi on and ip^ are to emphasize the dependence of the eigenvalues and 
eigenfunctions on /i. The operator t/^[(V+yV)] is compact, so ([5]) has only discrete spectrum. 
li IJ, = E, then there is an eigenvalue = 1 and the corresponding eigenfunction i/)^ of ([S|) 
is the desired ground-state eigenfunction of ([2|), as one can see by rearranging ([5]) into 
We note that other eigenfunctions may be obtained by defiation. 

When jj, — E, Xf^ — 1 is the largest eigenvalue, so a simple iteration like the power method 
yields the desired ground-state eigenfunction. The eigenvalues A^ depend analytically on /i, 
so when fi is sufficiently close to E the power method will still yield an eigenfunction of ([5]) 



an Unconstrained Sum of Slater Determinants 



6 



with energy near the minimum of Q. From ijj^ and one can construct an update rule 
for n, based for example on applying Newton's method to solve = 1. 

The convergence rate of the power method to produce ipp, and A^ is linear, and depends, 
as usual, on the gap between the eigenvalues in ([5]). The convergence rate of Newton's 
method to solve A^ = 1 is quadratic, so fi will converge to E quadratically, provided that 
A^ and ■0^1 have been found at each step. In the practical use of this approach, one does 
not wait for the power method to converge at each step, but instead intertwines it with the 
update of /i. Beginning with an approximation to the energy fiQ ^ E and an approximate 
wavefunction ^/jq, one converts ([5]) to an iteration 

^„ = -ep„[(V + >V)^„]. (9) 

After each iteration one normalizes by setting 

^Pn+l - i'n/UnW ■ (10) 

Following the approach of [26], we can use the update rule 

Mn+l = - ((V + >V)V'„, V'n " i^n)/Un\\^ , (H) 

which is equivalent to using Newton's method. 



II. 2. 2 Approximating with Fixed Separation Rank r 

We restrict the method to approximate wavefunctions of the form ([5]), with r fixed, by 
replacing the definition of ipn in We define 4'n to be the function of the form ^ that 
minimizes the (least-squares) error 

||^n-(-aM„[(V + W)V^„])||. (12) 

Since using instead of ^ introduces an error, the update rule PT|) may no longer give 
quadratic convergence, and in any case is not expected to converge to the true energy. One 
may choose to replace the update rule (llip with the more robust but slower converging rule 

Mri+1 — TTl iT9 ' y^"^) 



10, 



Tl + 1 



which is based on (jH). Other rules may be possible as well. At present we do not have 
enough numerical experience to decide which rule to prefer. 

The Green's function iteration itself does not enforce the antisymmetry condition. In 
order to assure convergence to an antisymmetric solution, we use the pseudo-norm induced 
by the pseudo inner product (•, ■)a = {A{-),A{-)), as we did in [5]. 

The least-squares problem ()12|) is non-linear, and so very difficult in general. To simplify 
notation in the description of our method, we now suppress the index n in (|12p and consider 
a single problem of that form. We begin by setting — ip, and then iteratively improve ip to 
reduce ([T2|) . Although we can see several strategies for improving ip, for concreteness we will 
restrict our description to the strategy most similar to 5 . To improve the approximation ip 
we loop through the variables (electrons). The functions in variables other than the current 
variable are fixed, and the functions in the current variable are modified to minimize the 
overall error p2p . The error p2p depends linearly on the functions in a single variable, 
so the minimization becomes much easier. This general Alternating Least-Squares (ALS) 
approach is well-known (see e.g. [27l [36l [38l [TOl [Ml [58]). Although to minimize ([T2)) one 
may need to loop through the variables multiple times, it appears to be more cost effective 
to loop only once and then do the next Green's function iteration. We alternate through the 
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directions, but for ease of exposition we describe the fc = 1 case. So, 0^ is fixed for fc > 1, 
and we will solve for the values of 4>{ for all I. 

To refine in the current variable, we set up and solve a linear least-squares problem. The 
normal equations for a least-squares problem are derived by taking a gradient with respect 
to the free parameters and setting the result equal to zero. As long as the approximating 
function is linear and not degenerate in these parameters, the resulting equations are linear 
and have a unique solution, which minimizes the error with respect to these parameters. 
Usually these free parameters are coefficients of the representation in some fixed basis. For 
example, to find the coefficients {c,} to minimize 



= (f -J2'''9^J -J2'''9^) , (14) 



(15) 



i 

construct the normal equations 
with 

A{k,i) ^ {gk,gi) and b{k)^{gk,f), (16) 

solve them, and set Ci — x{i). Instead of using coefficients in some basis as our parameters, 
we take the parameters to be the point values of our functions so that the gradient 
becomes a variational derivative. Formally, we consider a basis of delta functions (5(7 — •) 
and let their coefficients be our parameters. We still obtain linear normal equations but 
now b and x are vectors of functions, and A is a matrix of integral operators. Specifically, 
b{l) is a function of 7, x{l') is a function of 7', and A(l, I') is an integral operator mapping 
functions of 7' to functions of 7. The kernels in A are formally defined by 

IN N \ 

A(^n(7,70 = s>>U(7-7i)n<^'(^»)''5(^'-^i)n^''(^»)) ' (17) 

and the functions in b are defined by 

m{l)^hj2s^nUl-ll)\{~^\{l^),-g,[V + yV]\{c^Til^)) ■ (18) 



i=2 




Once we solve (|15p . we set — x{l). To enforce the normalization convention \\4>{\\ = 1 we 
can divide (f>\ by its norm and incorporate the norm into s;. 

To solve the matrix-integral system p^ . we need an iterative method for solving linear 
systems that uses only operations compatible with integral operators, such as matrix-vector 
products, vector scales and additions, and vector inner products. Typically the matrix A 
in normal equations is positive-definite. Our operator A is only semidefinite due to the 
nuUspace in the antisymmetric pseudonorm. Fortunately, b was computed with the same 
pseudonorm and has no component in the nuUspace of A, so we can still use methods for 
positive-definite matrices. Based on these considerations, we choose to use the Conjugate 
Gradient iterative method (see e.g. [21]) to solve One initializes with r b — Ax, 

V = r, and c = (r,r), and then the core of the method is the sequence of assignments 
z ^ Av, t <— c/(v, z), X ^ x + <v, r ^ r — <z, (i ^ (r, r), v ^ r + {d/c)v, and c ^ d, 
applied iteratively. 

One advantage of using this iterative method with integral operators is that our algorithm 
does not rely on any particular basis. The representation of x can naturally be adaptive in 
7, for example refining near the nuclei as indicated by the refinement in b. We assume the 
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availability of some adaptive, high-accuracy representation for single-electron functions, such 
as the polynomial multiwavelet representation demonstrated in [251 126j , which effectively 
eliminates the basis-set error. For the estimates of computational cost, we use M to denote 
the cost to represent a function of 7, or integrate such a function. The antisymmetry 
constraint requires N < M, and in general we expect M to be much larger than N. 

II. 2. 3 Summary of the Remainder of the Paper 

The core of the paper is the development of the methods needed to construct A in (fT7|) and 
b in (jlSp . First, in Section IlIIl we develop the machinery and algorithms for computing 
antisymmetric inner products involving the operators T, V, and W. Our formulation uses 
low-rank perturbations of matrices, thus avoiding cofactor expansions. We also avoid explicit 
construction of W by incorporating its effect via spatial convolutions with the Poisson kernel 
in three dimensions. Second, in Section llV| we show how to compute antisymmetric inner 
products involving these operators and the delta function (5(7 — 71). Again the key is to use 
low-rank perturbations of matrices. 

In Section|V]we assemble all our tools to demonstrate how to perform our main algorithm, 
and in particular how to construct A in (fT7|) and b in (fT8|) . We also gather the computational 
cost for the whole method. The cost depends on the number of electrons N, the separation 
rank r, the one-particle representation cost M , the number of Green's function iterations / 
(see Section ril.2.1[) . and the number of conjugate gradient iterations S (see Section Hi. 2. 21) . 
Although S in theory could be as many as the number of degrees of freedom rM , we have a 
very good starting point, and so expect only a very small constant number to be needed. We 
use M log M to denote the generic cost to convolve a function of 7 with the Poisson kernel 
l/||r||. A Fourier-based Poisson solver on a uniform grid would achieve this complexity; for 
adaptive methods such as we use it is very difficult to state the cost (see [71 E]). We use L 
to denote the number of terms used to approximate the Green's function to relative error e 
with Gaussians, and prove in Section rvH that L = ©((Ine)^) independent of fj. and N. The 
final computational cost is then 

0{Ir^N^[L{N + M log M) + S{N + M)]). (19) 

For comparison, the cost to evaluate a single antisymmetric inner product via Lowdin's rules 
is 0{N^{N + M)). 

II. 3 Further Considerations 

We have implemented the method developed here and tested it sufficiently to verify the 
correctness of the algorithm as presented. The numerical results are too preliminary to allow 
us to make any particular claims at this point, however, so we will present them separately. 
The linear algebra accelerations based on Appendix [B] have not yet been implemented. 

We develop the method in terms of the total variable 7 without specifying the spin 
states. If a specific spin state is imposed on our initial trial wavefunction -00, the iteration 
will preserve this state. 

The representation ([5]) does not account for the inter-electron cusp (see e.g. [56l |46j 
l35l l49l [50I f34l [37]). and thus we cannot hope to achieve small error e in the wavefunction 
with small r. As with Configuration Interaction methods, we may still be able to achieve 
small error in the energy difference of two systems, which is often the quantity of interest 
in Chemistry. For the current work, we fix r and adapt 4'\{li) si to minimize the error 
e, rather than fixing e and adaptively determining r. We are developing an extension to ([5]) 
that incorporates the cusp, and hope to achieve small error e through it. 
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Similarly, ^ is not size-consistent/extensive, and thus is not suitable for large systems. 
We are also developing an extension to ^ suitable for large systems, and hope to achieve 
linear scaling through it. 

Although we have focused on the multiparticle Schrodinger equation, the tools that we 
have developed are another step towards general-purpose, automatically adaptive methods 
for solving high-dimensional problems. 



Ill Antisymmetric Inner Products 

In this section we develop methods for computing antisymmetric inner products involving 
W, V, and T. For this purpose, after setting notation, we develop methods for computing 
with low rank perturbations of matrices, review the antisymmetry constraint and define a 
notion of maximum coincidence. With these tools we then derive the main formulas. 



III.l Notation 



We denote a column vector with suppressed indices by F and with explicit indices by F{i). 
We denote its conjugate transpose by F*. We use to denote the column vector that 
is one in coordinate i and zero otherwise. A linear operator is written C We denote a 
matrix with suppressed indices by L and with explicit indices by L(i,j). Recalling that 
r = (x, y, z) e R^, we combine spatial integration with summation over spins and define the 
integral 

' f{l)dl= J2 f fi^^^)dr. (20) 
We define the action of the single-electron kinetic and nuclear potential operators by 



(T,+V.)[/](7)= (-^A + «(r)')/(7)= (-iA + «(r))/(r,a). (21) 



In what follows we will reduce the action of the inter-electron potential operator W to 
convolutions with the Poisson kernel, so we define 



[/](!■) 



1 



o-'e{-l/2,l/2} " " 



(22) 



We allow these operators to be applied componentwise to vectors and matrices of functions. 
Next, we define $ — YiiLi 't>i{li)i so for example we can write ($,<i>)^ instead of 

We also associate with the product <I> a vector of N func- 



tions of a single variable. 



(23) 



We can then, for example, construct a new vector of functions by applying a matrix to 
an old one, as in = L^^^. Although we do linear algebra operations on these vectors, 
we note that $ + ^ does not correspond to $ -I- so there is not a true vector-space 
structure. Our formulas contain fairly complicated expressions with such vectors, such as 
J $*yv^ [0$*]0c?7. To parse this expression, we note that is a column vector of fimctions 
and is a row vector of functions, so 0$* is a matrix of functions. Then [0^>*] is 
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still a matrix of functions, but applying on the left and on the right yields a single 
function, which is integrated in the implied variable 7 to yield a number. When explicit 
specification of the variable involved is needed, the notation $(7) indicates that the single 
variable 7 is used in all the functions. 

III. 2 Determinants of Low-Rank Perturbations of Matrices 

Since the antisymmetric inner product involves determinants, we will use some linear algebra 
relations for them. Proposition [2] in this section is used heavily, and is the key to avoiding 
rather unpleasant cofactor expansions. 

Proposition 1 (Determinant via Schur Complement) Let A be a nonsingular square 
matrix, D a square matrix, and B and C matrices of appropriate size. Then 

= |A| |©-CA"^B| . (24) 




Proof: (see e.g. [5T]) It is easy to verify directly that 

" A B 

C © 

Since the determinants of the first and third matrices are equal to one, the determinant of 
the middle matrix gives the desired result. □ 



(25) 



Proposition 2 (Determinant of a Perturbation of the Identity) Let {uq}'j^^^ and {vq}'j^^^ 
be two sets of vectors of the same length, and UgV* denote the outer product o/Ug, andwq. 



Then 



Q 

9=1 



1+V^Ui V^U2 
V^Ui 1+V^U2 



VnU2 



V2UQ 
1 + V^UQ 



(26) 



Proof: Let U be the matrix with the vectors {u^} as its columns, and V the matrix with 
the vectors {vg} as its columns. Note that U and V are of the same size. By Proposition [1] 
we have 

I U 



-V* I 



= |I + V*U| 



(27) 



which evaluates to the right side of P5|) . Exchanging the roles of A and D in Proposition [T] 
we have 

I U 



-V* I 



|I + uv* 



(28) 



which evaluates to the left side of (p6|) . □ 
The Q = 1 case is well-known (see e.g. [51]) but we have not found the general case in the 
literature. 
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III. 3 The Modified Pseudo-inverse 

The singular value decomposition (SVD) (e.g. [21]) of a x iV matrix is 

N 

A = J2 Si^i^i = USV* , (29) 

i=l 

where the matrices U and V are unitary and the singular values {si} are non-neganive and 
in descending order. The left singular vectors {u^} form an orthonormal set, as do the right 
singular vectors {vj}. The pseudo-inverse is defined as 

N-Q 

= E ^r'v.u* , (30) 

where Q is the dimension of the (numerical) nuUspace. We also define a projection matrix 
onto the nuUspace 

Definition 3 

N 

A^- J2 v.u* (31) 

and a modified pseudo-inverse 

Definition 4 (Modified Pseudo-Inverse) 

+ A-L . (32) 



Note that A-'- and thus A-^ are not uniquely defined since the choice of basis for the nuUspace 
is not unique. For our purposes any consistent choice works. The modified pseudo-inverse 
behaves much like the pseudo-inverse, but always has a non-zero determinant. 



|A*| = |U|r 



^0. 



(33) 



III. 4 The Antisymmetrizer and Lowdin's Rule 

Given a separable function, its antisymmetric projection can be found by applying the 
antisymmetrizer A (see e.g. [3S1), also called the skew-symmetrization or alternation (see 
e.g. pSi, i51j ). resulting in a Slater determinant. In the vector notation we have 



1 

m 



*(7i) 



*(7Jv) 



1 

iV! 



01 (71) 01(72) 
02(71) 02(72) 

(f>N ill) 0Af(72) 



0l(7Ar) 
02(7Ar) 



0w(7- 



N 



(34) 



One cannot explicitly form a Slater determinant for large N since it would have A^! 
terms. However, one can compute the antisymmetric pseudo inner product 



(l>,$)^ {A^,A<i>) = {^,A^) - {A^,<^>), 



(35) 
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where the first equaUty is a definition and the others foUow since A is an orthogonal pro- 
jector. It is not a true inner product because it has a nuUspace. To compute (j35p . first 
construct the matrix L with entries 

L(z,j) = (^.>j) (36) 

at cost 0{N'^M). Then use ($, <I>)^ = {A^, $) and move the integrals inside the determi- 
nant to obtain 

(*,$)^ = ^|L|, (37) 

which is the so-called Lowdin's rule (e.g. [40l |48]). Since L is an ordinary matrix, its 
determinant can be computed with cost 0{N^) (or less). The denominator A^! need never 
be computed, since it will occur in every term in our equations, and so cancels. 

Our method for enforcing the antisymmetry constraint, as described in [S], is to use the 
pseudo-norm based on the antisymmetric inner product {■,-)a for the least-squares fitting 



III. 5 Maximum Coincidence 

Consider two products, $ = 11^=1 'Piili) ^ ^ 11^=1 4'i{li)i stored in the vector notation 
of (P^l) as $ and To specify which functions were used to compute L in (|36p . we use the 
notation L($, $). The matrix of inner products L = L(4>, $) is in general full. Defining 

0=L-i*, (38) 

we have 

^e = ^|[ (L-i*)(7i) ••• (L-i*)(7^) ]| 

= |L"'I]^|[ *(7i) ••• Hin)]\ = \L-'\A^- (39) 

Thus the antisymmetrizations of $ and are the same up to a constant, and we can use O 
instead of $ in calculations. The advantage of using Q is that the resulting matrix of inner 
products L = L(8, $) = I; in other words, we have the biorthogonality property {9i,(j)j) — 
Sij. To show this, write the matrix L as / 0#*d7, where the integration is elementwise. 
Substituting for 0, we have / (L~^^)^*d'y. Since the integration is elementwise it commutes 
with L"^ and we have L^^ J ^^*d'^ = L^^L = I. The computational cost to construct 
is 0{N^{N + M)). 

When the matrix L in (|36|) is singular, we define = using the modified pseudo- 
inverse of Definition m By the same argument as before, we have IL-^-j^^^O = A^. The 
matrix J &^*dj evaluates to L-^L = I — J2i=N-Q+i'^i'^i ■ ^'^^ notational convenience in 
later sections, we will re-index our singular values and vectors so that the first Q generate 
the nuUspace, rather than the last Q. 

Remark 5 Within Configuration Interaction methods, the functions in $ and $ are taken 
from a master set of orthonormal functions, and is simply a signed permutation of $ 
so that 4)j = 9j for as many j as possible. This is known as the 'maximum coincidence' 
ordering. The construction we use generalizes this notion. 
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III. 6 Antisymmetric Inner Product with the Electron-Electron Po- 
tential yV Present 

In this section we derive formulas for computing antisymmetric inner products that include 
the electron-electron interaction potential. Although the derivation is somewhat messy, the 
resulting formulas are rather clean, and we use them verbatim in the computations. The 
main ideas are given in this section, and then reused in later sections for other cases. 



Proposition 6 When L from iS6]) is nonsingula 



is equal to 



where = L 



(40) 



(41) 



Proof: Using the maximum-coincidence procedure in Section llll.Si ([40|) is equal to |L| (6, yV$)_^. 

We reorganize and find that we must compute 



IH / I V 

2 TV! ./ 1 ^ llr, - r. 



N 



9i(7i) ^1(72) 
92(71) 6*2(72) 



9^(71) Sn{'J2) 



hilN) 
92(7JV) 



(in) 



rf7i • • • d-jN ■ (42) 



By moving the sum outside of the integral, wc can integrate in all directions except 7^ and 
7j. Using {9m,4>n) = Smn, WC obtain 



9 /VI ^ 



1 



2 TV! ^ 



'^'.(7)^1(7) 



••• 0,(7)^.(7) 



</',(7')^i(7') 



0,(7')^^ (7') 







••• <^,(7)^,(7) ••• 0,(7')^.- (7') ••• 



0,(7)0iv(7) ••• '^,(7')^iv(7') 



I + (0,(7)e(7) - e.) e* + (0,(7')0(7') " e,) e* 



d'jd'y' 



djdj' . (43) 



Since the inner matrix is a low-rank perturbation of the identity, we reduce its determinant 
using Proposition [2] and obtain 



9 wi 



9,(7) e,{i) 



djdj' . 



(44) 



The determinant is zero if j = i, so we do not need to explicitly prohibit it as we needed 
to in (|43p and above. The antisymmetrization has caused a convenient cancellation of a 
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fictitious self-interaction, and, thus, allowed us to decouple the two sums. Expanding out 
the determinant and rearranging the terms, we obtain 



d'y 



In our compact notation, this yields (|¥T|) . 



d-f . (45) 



□ 



We now consider the computational cost of (|¥T|) . In the first term in (HH), computing 
**© costs 0{NM), applying [•] to it costs O(MlogAf), and the integral in 7 costs 
0{M). In the second term, *0* costs 0{N'^M), applying [•] to it costs ©(iV^MlogM), 
applying 0* and then $ costs 0{N^M), and then the integral in 7 costs 0{M). Including 
the cost to construct 0, our total cost is 0{N'^{N + Af logM)). 

III.6.1 The Singular Case 

In this section we investigate the case when the matrix L from ([36]) is singular. Inserting 
the definition = L~^$ into our main formula (|¥T|) . we have 



In terms of the SVD ([29|l . we can express 

N 



h-^^d-f. 



L-i=^.s7iv,u; and |L| = |U||V*| J] 

j=l i 



(46) 



(47) 



Inserting these expressions into (j46p . we have 

iiuiivia^ 



N 



i=i 



AT 



k=l 



N 



_J=1 

N N 



N 



k=l 



j=l k=l i^j,k 



**VfeUt* 



u****Vfe ul^d-/. (48) 



If L is singular then at least one Si is zero, and only terms that exclude those from the 
product in are nonzero. Since we exclude two indices in the product, if more than two 
Si are zero then the entire inner product is zero. If exactly two are zero then only one term 
in the sum survives. If exactly one is zero then we can simplify from a double to a single 
sum, using symmetry. Recalling the modified pseudo inverse from Definition |4] and sorting 
the zero Si to the beginning for notational convenience, we obtain the following propositions. 
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Proposition 7 When the rank- deficiency ofL is more than two, the antisymmetric inner 
product ^U\ l evaluates to zero. 

Proposition 8 When the rank- deficiency of L is equal to two, the antisymmetric inner 
product \4U^ is equal to 



1 



\ht\N\ 



**V2U2* 



(49) 



Proposition 9 When the rank- deficiency of L, is equal to one, defining = or & 

iJ^, the antisymmetric inner product J^Qp is equal to 



**viuI*>Vp [**©] - **vi>V^ 



&d-y. 



(50) 



In computing (|49|) . constructing **vi, $*V2, uj$, and costs 0{NM), applying 
[•] costs 0{M\ogM) and, finally, the integral in 7 costs 0{M). In computing (f50|) . the 
first term costs 0{NM) to form $*0, 0{M log M) to apply [•], and 0(1/) to integrate 
in 7. The second term costs 0{NM) to form 0(iVM log M) to apply [•], 0(7VM) 

to apply 0, and 0{M) to integrate in 7. In total, the computational cost for the singular 
cases are less than the cost of the nonsingular case. 

Remark 10 In the Configuration Interaction context, rank- deficiency two corresponds to 
a double excitation. The vectors Ui and Vj would be zero except for a single entry, and so 
select the locations of the excited electrons out of ^ and Proposition\^ then reduces to 
the Slater-Condon rules '131. 



III. 7 Antisymmetric Inner Product with T and/or V Present 

Since T and V both have the structure of a sum of one-directional operators, we state the 
formulas for their sum, although of course they can be treated individually. 



dcf 



Proposition 11 //L from i f5^) is nonsingular, 
(Jl',(r + V)$ 

is equal to 



N / ^ \ N 



H I (r, + v.)[*]*0d7. 



(51) 



(52) 



Proof: We follow the same procedure as we used for the electron-electron operator W in 
Section fill. 61 Instead of (|43|) we have the simpler expression 



m ^ 



((T, +V,)[^,] (7)0(7) -e.) e* 



c?7 . 



Applying Proposition [2] we obtain ([5^ . 



(53) 



□ 



To analyze the computational cost to compute ((52)) . we note that it costs 0{NM) to 
apply {%, -|- V*)[-]. Including the cost for the maximum coincidence transformation, our 
total cost is thus 0{N'^{N -\- M)). 
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III.7.1 The Singular Case 

We now state the formula when L is singular. The analysis is similar to that for W in 
Section [mXTJ 

Proposition 12 If the rank- deficiency o/L is greater than one, 1151]) evaluates to zero. If 
it is equal to one we have 

J {% + V.) [*Vi]<*d7. (54) 
To compute dM]), it costs 0{NM) to form **vi and u**, and 0{M) to apply {% + V,) [•]. 



IV Incorporating Delta Functions into the Antisym- 
metric Inner Products 

In this section we show how to compute antisymmetric inner products when one of the 
component functions is replaced by a delta function. For concreteness, we will replace 
01 (71) by (5(7 - 71). 



IV. 1 Lowdin's Rule with 5(7 — 71) Present 

The matrix L from (|36p is defined by L{i,j) — {(pi, <j)j). If we replace 0i(7i) by (5(7 — 71), 
then the first row depends on 7 and is given by L(l,j) = ((^(7 — = 0j(7)- We thus 

have a matrix that depends on 7, 



L(7) = 



01(7) 02(7) 

((^2,0l) (</>2,02) 



0Af(7) 
(02, 0^) 

[4'N,(f>N) 



(55) 



To compute with L(7) without resorting to cofactor expansions, we express L(7) as a rank- 
one perturbation of a matrix of numbers. Define 



E 



d(l) 



d(2) 



diN) 



02, (PN 



(56) 



where the vector d* is chosen to be a unit vector orthogonal to the remaining rows of E. 
This choice assures that the rank deficiency of E will be smaller than or equal to the rank 
deficiency of the matrix with any other first row. It also gives us some convenient properties, 
namely Ed = ei, d*E* = e^;, E^ei = d, and eJE = d*, where E* is the modified pseudo- 
inverse of Definition U) It costs 0{N'^M) to construct E and 0{N^) to compute and 
|E|. 

We then have 

L(7) =E + ei(*(7)-d)* (57) 
and, with the help of Proposition [21 compute 



|L(7)| = |E||I -t- d(*(7) - d)*| = |E| (1 + (*(7) - d)M) = |E| $(7)*d , 



(58) 



which yields 
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Proposition 13 

/ N N \ 

{s{l-ll)l[Ml^),l['i'^il^)) =|E|*(7)*d, (59) 

\ 4=2 i=l / X 

where E and d are defined as above. 

Remark 14 If i > 1 then 

(|E|**d,0,) = |E|(*>,)*d= |E|^;(^,•)*d = 0, (60) 

since d is orthogonal to E{i,-), which is row number i o/ E. Thus the function I159\) is 
orthogonal to (j)i for i > 1. The same property will hold when the operators T , V , and W 
are present in the antisymmetric inner product, as described in the following sections. 

IV. 2 Antisymmetric Inner Product with 5(7 — 71) and (T and/or 
V) Present 

To compute antisymmetric inner products involving operators, we will modify formulas from 
Section IIIII The first (trivial) modification is to denote the variable of integration in those 
formulas by 7', so as not to confuse it with the variable 7 in (5(7 — 71). Next we replace |L| 
with |L(7)| given by ([55]) . Using ([Ff)) . we can express 

L(7)-i = (E + ei(*(7) - d)*)-^ = (E (I + d(*(7) - d)*)))-^ 

= (I + d(*(7)-d)*)-iE-i. (61) 

Using the Sherman-Morrisson Formula (see e.g. [21] and (|B5|) in Appendix iBjl we then have 

d(*(7)-d)* , ,(d-*(7))*\^-i 



L(7)" = I Wt^ ;V-r IE' = I + d ' , , E~\ (62) 

V i + (*(7)-d)M; V *(7)*d ; ^ ' 

The vector of functions 0, which was defined by L^^$, now depends on the variable 7 
in (5(7 — 71) as well as its own internal variable 7'. Replacing L^^ with ([62]) and $ with 
$(7') + 81(^(7 — 7') — (/)i(7')), we obtain 

0(7,7')= (l + d^^^^^^) E-i(*(7')+ei ((5(7-7') --^i (7'))) • (63) 

To compute it, we first compute the base case 0(7') = E^^$(7'). Multiplying out ([63]) and 
noting d*0 = d*E-''^ = ^1, we obtain 

0(7, 7') = 0(7') + d^:®(2i^(^)*®(^') + - 



*(7)*d 



We are now ready to state our main formulas. 
Proposition 15 When E is nonsingular, 

IN N 



- 71) n '^^(t^)' + n '^'(^^) ) (65) 
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is equal to 



m 

m 



*(7)* (^d J {% + V*) [^]*®di -J {%+ V,) [#*d]©d7'^ 



+ (r,+V,)[**d](7)] , (66) 

which can he computed with total cost 0{N^ + N'^M). 

Proof: To compute (|65|) . we start with / (T* + V*) [$]*0d7' from ([52]) and substitute 
in (inSl) and ([Ml) to obtain 

/ (T. + V.) [*](.')• - , *h)'ehy(7-y) j , 

Distributing out and rearranging, we have 

^ 1 *(7)*d(r, + V,) [*]*(7')0(7') - + V.) [*](7')*d*(7)*0(7') 

+ {% + V,) [*] (7')*d(5(7 - 7')d7' , (68) 

which yields ([M)) . Although in (|62p and we divide by $*d, which could be zero, this 
denominator cancels in the final expression, so we can argue by continuity that the final 
expression is still valid. One can also prove this directly by determining the nuUspace of L 
and then using ([54|) . □ 



Remark 16 It is the term with pointwise multiplication, {T^ + V*) [$*d] in I6'6]) . that al- 
lows adaptive refinement around the nuclei in the numerical algorithm. 

To obtain the formulas when E is singular, we follow the same logic as in Section llII.6.11 
Denote the singular vectors in the nuUspace of E by {(ui, Vi)}. 



Proposition 17 When E has rank deficiency greater than one, i65\) is zero. When E has 
rank deficiency one, \65\} is equal to 

(69) 



i^*(7)* (d I (r,+V,)[**vi]ut*d7'-vi I (T,+V,)[**d]ut*d7') , 



\Ei\m 

which can be computed with total cost 0{N'^ + N'^M). 

IV. 3 Antisymmetric Inner Product with ^(7 — 71) and W Present 

Conceptually the derivation if W is present in the inner product is the same and we obtain 
the following propositions. 

Proposition 18 When E is nonsingular, 

I N N \ 

(70) 



an Unconstrained Sum of Slater Determinants 



19 



is equal to 



m 

2Nl 



(*(7)" 



**0 



(7) - *(7)*W, 



0**d 



(7) 



*(7)* ( d / **0>V, 
-2 / &W. 



**0 



0*' 



**0 



**d 0**>V 



0d7' 
0#*d 



which can be computed with total cost 0{N^ + TV^A/logM). 
Proposition 19 When E has rank deficiency one, |70[ ) is eguaZ to 



(*(7)^ 



**viu** (7) - *(7)*viW^ 



|Et|iV! 

+*(7)* {a I **vi (Gt*W^ **© 
0(**vi>Vp ut***d 
vi / **d (u^^W^ [**0 



u****d 



0) d7' 



(7) 



**d) d-y' 



$*ViU*$ 



uj*** 0)^7' 



which can he computed with total cost 0{N^ + N'^M + NMlogM). 
Proposition 20 When E has rank deficiency two, |70[ ) is equal to 



\Ei\m 



*(7)' 



d / **viU**W^ 



$*V2U2$ 



**V2Wp 



Ui*d7 



Vi / **V2U2*>Vp 



**duj* 


- **V2>Vp 


u^***d 


ul^d'y 


**dU2* 




ui#**d 


ul^d-y 



(71) 



(72) 



(73) 



-^2 J **viUi*>Vp 

which can be computed with total cost 0{N^ + NM + Af log Af). 

V Details of the Green's Function Iteration 



In this section we fill in the missing pieces in the Green's function iteration algorithm 
outlined in Section [il.2l First we give a representation for the Green's function itself. Then 
we use the methods in the previous sections to construct the vector b in ((T8l) and the matrix 
A in ([T7]) to form the normal equations ([T5]). Next we give the algorithm from Section |TL2] in 
outline form as pseudocode. Finally we gather the computational cost of the whole method, 
and present some linear algebra techniques to reduce it. 

V.l Representing the Green's Function 

In this section we construct a separated representation for the Green's function in ([7]), 
following the ideas in [4l[5] (see also [22l[23j). We will use this representation in Section lV.2l 
when constructing the right-hand-side of the normal equations. 
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We begin by constructing an approximation of 1 /t with exponentials such that 



1 ^ 

- - ^Wpexp(-Tpi) 



(74) 



on the interval t €z [1, oo), with Wp and Tp positive. Expansions of 1/t into exponentials have 
been used in several applications and constructed by diverse techniques; see [8] [29l [59l [6l |9j 
[24] and the references therein. The interval [1, oo) is addressed specifically in [9], where it is 
shown that the error rate e — C'(exp(— ca/L)) can be achieved, which means we can achieve 
i^O((lne)2). 

Substituting t — s/{—^) for ^ <Q into (I74p and dividing by — /x, one has 



« — /i — /i 



< 



valid on the interval s e [— /x,oo). In Fourier coordinates, we can express 

1 



(75) 



(76) 



from which we see that = 
substitute into ((75|) and obtain 



l/(— /x). Since the denominator is at least — /x > 0, we can 



N 



^.-E^^-^^(8)-p(-^ef) 



< — = 4Q,\ 



(77) 



Thus we obtain an approximation of with relative error e in norm using L terms, with 
L independent of N and /x. To construct as an integral operator in spatial coordinates, 
we apply the inverse Fourier transform to obtain 



L N 
p—1 i—1 

where the convolution operator !F^.^ which depends imphcitly on ^, is defined by 



(78) 



Hifili-, ■■■,1n) 



l/N 



2ttt, 



exp 



p . 

/II 2 



2t„ 



3/2 



/(7i 



,7,_i, (r',cr,),7,+i, . . . ,7Ar)dr' . (79) 



This construction has theoretical value, since it has proved the following theorem. 

Theorem 21 For any e > 0, /i < 0, and N , the N -particle Green's function has a sepa- 
rated representation with relative error in operator norm bounded by e using L — ©((Ine)^) 
terms, with L independent of fi and N . 



V.2 Constructing the Right-Hand-Side Vector b in (1181) 

In order to do a step in the iteration, we need to construct the right-hand-side b in the 
normal equations (fT5|) in Section III. 2. 21 Since A is an orthogonal projection, A and Q^^ 
commute, and is self-adjoint, the entry (flS)) is equal to 



I N N \ 

-4a^<5(7 - 7i) n '^'(^^)' ["^ + H'^riT.) 

\ 1=2 i=l / 



(80) 
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Substituting (|78p in for and rearranging, we have 



N 



N 



6(0(7) 



-^rA-f - 7l) n-^ri'(7.), [V + W] J] 'I^T il.) 



(81) 



1=2 



The computation is of the same form for each value of the indices I, to, and p, so we can 
consider a single term and suppress the indices. 

To evaluate a single term {AJ'ri5{'j - 71) Y[^^2 -^r Mli), [V + W] JJ^^i i'iili)) we use 
the formulas in Propositions [T5H^ in Sections IIV.2I and IIV.3[ with two modifications. The 
first modification is that $ is replaced with JF4> throughout. This replacement causes no 
structural change to the formulas; it just changes the inputs. The second modification is 
caused by the replacement of (5(7 — 71) by JFri'5(7 — 71). The first row of L(7) in ([55]) 
becomes ^^$(7)*, which makes |L(7)| = \E,\T^{-^)*d. Similarly, ([M]) becomes 



0(7,7') = 0(7') -d 



J^*(7)*d 



(82) 



Tracking J- through the formulas, we find that all we need to do is to modify the formulas 
in Sections IIV.2I and IIV.3I by applying T to the final result. 



V.3 Constructing the Matrix A in ( fTTll 

In this section we construct the kernels in (jl7p for the normal equations (|15p , using the same 
ideas as in Section ITVl We fix I and V and define 



if (7, 7') = 
w(7') = 

y(7) = 



SlSl> 

[4>\{i) ... 

[0^(7) ... 



Using Lowdin's rules (p7|) we have 

if (7, 7') 



ILI 



1 



iV! N\ 

Expressing L as a low-rank perturbation of 



-5(7-7') y*(7) 
w(7') B 



1 

© 



we have 



(83) 

(84) 
(85) 

(86) 



(87) 



^(7,7') = T7T 



1 

N\ 
1 

TV! 



" 1 " 




" 1 " 


D 


+ 





1 






" 1 " 


B 




1 + 






y*(7) ] + 
y*(7) ] + 



TV! 



1 y*(7)E)"'w(7') 
1 <5(7 - 7') 



<5(7 - 7') - 1 

w(7') 
5(7 - 7') - 1 



[1 0] 
[1 0] 



= y (-5(7-7') -y*(7)B-M7')) . (88) 
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If D is singular then we apply the same logic as in Section llII.6.11 If D has rank-deficiency 
greater than one then K{^,"f') ~ 0. If it has rank-deficiency one then we have K{'^,^') = 



1 



+ 







[ V* ] 



y*(7) 

v*I])tw(7') 
-y*(7)v 1 y*(7)©*w(7') 
1 5(7 - 7') 



<5(7 - 7') - 1 
I])tw(7') 



[1 0] 



-(y*(7)v)(v*tfw(Y)) 

|Dt|iV! 

_ -(y*(7)v)(u*w(70) 
|Dt|iV! 



(89) 



where is the modified pseudo-inverse of Definition 01 

In the nonsingular case, we can construct D at cost 0{N'^M) and compute D^^ at cost 
0{N'^). Applying this kernel costs 0{NM) to integrate against a function in 7', 0{N'^) 
to apply D~^, and then 0{NM) to apply y* to the result. In the singular case, we can 
compute U)^ at cost 0{N^) and construct y*v and u*w at cost 0{NM). Since the variables 
separate, applying this kernel costs 0{M). 

Remark 22 In the case r — 1, which corresponds to the Hartree-Fock formulation, 1} — I 
and K{'~f,j') is just the projector orthogonal to {0j}^2- 



V.4 Pseudocode 

In this section we give the algorithm in outline form as pseudocode. We do not indicate 
when objects can be recalled or updated from previous computations. 



Loop through / Green's function iterations (|9ll0ll3p . For each of these: 
Construct as in Section lyTTl obtaining the operators JF^ in ([7^ . 
Loop through the TV directions (electrons). For each of these: 
Compute A{IJ') via §^ for aU {1,1'). 
Compute 6(0(7) in (ED) by: 

Loop in the r values of I and for each: 
Sum over the L values of p and for each: 
Compute !F^(j>l for all i. 
Sum over the r values of m and for each: 
Using !F^i in place of construct E in 
Compute |E| and E"^ 
Construct = E^^J^*. 
Construct **0, **d, and 0**. 



Compute 



**0 



0** 



and W, 

Compute (pS)) and (|7T|) using these ingredients 
Apply Jf^P to ((Ell) + dZHl). 
Apply conjugate gradient to solve the normal equations ([15]). 
Renormalize as in ^TU\\ . 
Update fi via 

Remark 23 We have presented the algorithm in serial form for clarity. The loop in I, 
sum in p, and sum in m can be trivially parallelized. Parallelizing the loop through the N 
electrons would represent a change in the algorithm, which we will develop elsewhere. 
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V.5 Overall Computational Cost 

The computational cost is dominated by the repeated construction and solution of the 
normal equations ([TS]). For a fixed direction, the construction cost is dominated by (|8ip . 
which has r^L inner products. The most costly portion of the inner products is (j7ip . which 
requires 0{N^ + M log M) operations, giving us the net construction cost 

0{r'^LN^{N + MlogM)) . (90) 

The operation count to solve the normal equations (|15|) by applying the matrix of integral 
operators A 5* times is 

0{r^SN{N + M)). (91) 

As we loop through the directions, we may reuse several quantities, so the total cost of 
the construction is less than N times the cost for one direction. In fact, the construction cost 
for the entire loop through N directions is of the same order as the cost for one direction. 
The application cost is simply multiplied by N. In the sections below we show how to 
update the construction for direction k = 2 using what we already have for direction fc = 1, 
and then determine the cost for one loop through the directions. We defer the development 
of the technical linear algebra rules on low-rank updates to Appendix |b1 and here only show 
how to apply them to our problem. Our final conclusion is the computational cost 

0{Ir^N^[L{N + MlogM) + S{N + M)]), (92) 

where / the number of Green's function iterations. 



V.5.1 Reuse in Computing 



Let Bi denote D in ((86)) for directions one, and I 
denote the updated version of (/){ . To construct 
of Di to be updated, specifically 



)2 the version for direction two. We let (pl 
[I>2 requires only the first column and row 



J'2 = 



ei 



[0 



] + 



(93) 



Computing those inner products involving 4)\ and (t)\ costs 0{NM). Using Proposition [Ml 
twice, we compute ID)|, |D||, and if appropriate v, all at cost 0{N'^). The formulas ([87| and 
following are modified by inserting the extra column and row in the second place instead of 
the first, but otherwise the procedure is unchanged. The cost for one loop through the N 
directions is thus 0{N^ + N'^M). 



V.5. 2 



Reuse in Computing Antisymmetric Inner Products with ^(7 
Operators 



71) and 



We again let 0^ denote the updated version of 4)\ computed during the k = \ solve. The 
inner products needed to construct E2 require only the one row involving 0i to be updated, 
at cost 0{NM). The vector di can be constructed by doing the SVD of Ei with the first 
row set to zero and then selecting one of the right singular vectors ^f^ with zero singular 
value. Using Proposition [M] we obtain the SVD of E2 with first row set to zero and second 
row containing the new inner products, and thus can find d2. Putting the first and second 
rows back in proper position, we then have 



E2 =Ei +ei ([ {T4>u<t>i) 



{T4>u<t>N) ] -dt) 

+ e2(d*- [ ■■ 



{T4>2,M ]) , (94) 
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and we can compute IE2I and using Proposition [Ml twice, at cost 0{N'^). 

Proposition 1241 produces a rank two update and we must apply it twice. For notational 
ease we will show how to use a rank one update applied once; the method easily extends. 
Assuming e| = + fg* , we next update 



02 = E|.F*2 = (EJ + fg*)(.F*i + ei(0i - 0i)) 

= ei+di(0i 



61) + fg*J^*i + fg*ei(0i 



(95) 



at cost 0{NM). It is insufficient to just update 02 in this way, since it would still cost 
0{N'^M\ogM) to compute ©2$* in ([7T|l . Instead we update the combined quantity 



02** 



01** 



**diW^ 



**fw^ 



g*j^*i*' 



**fg*ei>V^ 



)* (96) 



at cost 0{NM log M). With this quantity and ©2 we can compute (HIl) at cost 0{NM log M). 
The singular cases work similarly. The cost for one loop through the N directions is thus 
^(TV^MlogM). 
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A Appendix: Algorithms Based on Gradient Descent 

We prefer the integral iteration in Section III. 2. II due to the generally superior numerical 
properties of integral formulations. One could, however, try to minimize ([4]) directly with a 
method based on gradients. Since the machinery that we have constructed applies to these 
methods as well, we sketch how it can be used. 

To minimize Q we could use a gradient descent, starting at some initial guess for ijj. 
Inserting our current approximation ^ and formally taking the gradient, we have 

^ {-H^p, VV;)^(^, - {Hj^, Vy^)^ ^^^^ 

Defining /i to be our current value of the gradient reduces to 

((H^, V^)^ - VV)^) . (A2) 

The gradient is with respect to the parameters that are used to minimize (j4]). In our case 
that is the values of the functions Taking the gradient with respect to the point values 
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of (fij results in a vector g of functions, defined by 

r / w N \ 

<?j(7) = T--^SiE^™(^(^-^^)n'^'(^^)'(^-/^^)n'^"(^»)) ' (A3) 

where <5(7 — 7^) is the delta function. The methods in Section |IV] can be used to construct 
g- 

Moving t in the direction opposite the gradient replaces ip with 

r N 

/=i 1=1 

Some search procedure can then be used to find an appropriate t. Then -ip is updated and 
the procedure repeated. 

Alternatively, we can use an alternating direction approach and take the gradient with 
respect to the functions 4'\ for one direction i, while fixing the functions in the other di- 
rections, and then loop through the directions. This loop through the directions is then 
repeated / times until we obtain the desired accuracy. We describe the i — 1 case. Moving 
t in the direction opposite the gradient replaces ip with 

r N r N 

E ('^'i -*9[)Y[^'i = y^-tT. '^9[ l[4 = iJ-t4,. (A5) 

1=1 i=2 1=1 1=1 

Inserting (jA5P into ([4]) results in 



(A6) 



Once the inner products have been computed, we can find the niinimizer for (jA6p by solving 
a quadratic equation, and then update ip via (|A5p . The cost to construct g for one direction 
is times the cost for one inner product. The dominant cost for the inner product comes 
from (UH), which costs 0{N^ + TV^MlogM), giving us the net construction cost 

0{r'^N'^{N + M log M)) . (A7) 

As described in Section IV. 5. 21 many of the computations can be reused, so the cost for a 
single loop through the N directions is of the same order. Thus, for / loops through the 
directions the overall computational cost is 

0{Ir'^N'^{N + M log M)). (A8) 



B Appendix: Low-rank Updates 

In this section we develop formulas for low-rank updates to A^, A-"- and \^^\, based on 

[ma. 

Proposition 24 Given A, , A-^, \A^\, h, and c, let Ai = A + be* and compute 

d = Atb, e=(At)*c, f=(I-AAt)b, g = (I - AtA)c, 
d = d*d, e = e*e, / = f*f , 5 = g*g, 

A = l_+c*Atb, /i=|A|2 + rfg, jy=|A|2 + e/, ^ ^ 

p = Ad + dg, q = Ae -I- ef . 
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1. // A = 0, / = 0, and g = 0, then rank{Ai) — rank{A) — 1 and 

A| = At - d~idd*At + e-\-A^e + d-\d*A^e)d)e* 
A^=A^ + {1/Vdl)de* 
|aJ| =-(l/Vdi)|At|. 

2. //A 7^ 0, / = 0, and g — 0, then rank{Ai) = rank{A) and 

A\ = At - A- Me* 
A^ = A^ 
\AI\ = |At|A-i . 

3. If f = and g =/= 0, then rank(Ai) — rank{A) and 

A\ = At - /i-M(.gd*At + Ae*) + n-^g{-de* + Ad* At) 
A-^A--MVM^^)i±^g*A- 

4- If f 7^ ^ '^'^'^ 9 = 0) then rank{Ai) — rank{A) and 

A\ = At - u-\fA^e + Ad)e* + v-\-ed + AAte)f* 
Af = A^ - 



.(|A|(yi;-|A|)f + A/e)* 



/|A|^^ 



|Al| = |At| 



(A- A)|A|2 + Ai^ 



5. // / 7^ and g 0, then rank{Ai) = rank{A) + 1 and 

A\ = At - f ^dr + g"^g(-e* + A/-^f*) 

A^ = A^ - (1/V57)gf* 

I At I = I At I [l + (.9-1/-1 - (l/y^))g*A^f 



The cost to compute A\, Aj^ , and |Af | is 0{N'^). 

Proof: The overall method, update rules for rank{Ai), and update rules for 
from T , who also list the useful properties 

c*d = e*b = A - 1, b*f = /, c*g = g, d*g = 0, e*f = 0, 
AtAd = d, AAte^e, A*f = Atf = 0, Ag = (At)*g = 0. 



(B2) 
(B3) 
(B4) 



(B5) 
(B6) 
(B7) 



(B8) 
(B9) 

(BIO) 



(Bll) 
(B12) 

(B13) 



(B14) 
(B15) 
(B16) 



are taken 



(B17) 



They give update rules for the row and column spans of Ai , which we translate into update 
rules for A^ . The cases (IB3p , (jB6l) , and (IB15[) follow directly. Corresponding to (|B9p , their 
update rule is that the row span of A-^ should be extended (orthogonally) by d and then 
reduced by projecting orthogonal to p. We translate this into a (Householder) reflection 
of the vector g into a vector in the span of d and g perpendicular to p. Adjusting these 
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vectors to have equal norm and real inner product yields the reflection of the vector Xy/JIg 
to — \X\{gd ~ Ag), resulting in 

2(AV7Ig + |A|(.gd - Ag))(Av]Ig + |A|(ffd - Ag))* \ ^ ^ 

||(AV7^g + |A|(gd-Ag))|P ; ' ^ 

which simplifies to (jB9[) . To obtain (|Bf 2p we use the same process, extending the column 
span by e and then projecting orthogonal to q by a reflection of to — |A|(/e — Af). 

To derive the update rules for |Af |, flrst add the update rules for A| and A^"- and then 
take the determinant. On the right hand side factor out a copy of A^ leaving a low-rank 
perturbation of the identity, to which we can apply Proposition [2] To simplify the results, 
we use (IB1[) . (jB17p . and the further observations 

(At)-M = b-f, (At)-iAte = c-g, (At)-ig=(A^)*c, 
e*(At)-i =c*-g*, f*(At)-i =b*(A^)*. ^ ' 

To obtain (jB4p we compute 



|aJ| = |A*| I - d-ibd*Al' + ((l/\/de)b - e-^e + dr^ e'^ {<\* 



l-d-id*Atb d*At((l/Vd^)b- 



-i(d*Al'e)b) 



~d-ie*b 1 + e*((l/\/d^)b - er^e + d'^ e-^ {A* 

d*At(l/Vd^)b 

e*((l/\/di)b + d-ie-i(d*Ate)b; 



|At|(-(l/\/d^)). (B20) 



For dHZl) we have |aJ| = [A*] |l- A^^be*! = |At|(l - A-ie*b) = |A*|A-i . To obtain (|BT0)) 
we compute 



|At| 



I + (A*)-i ( d(-^-i(gd*At + Ae* 



Ag* 



-) 



+g(Ai-i(-de* + Ad*i 



|A|Vm 

(v7I-|A|)g*A^ 



I A* I 



1 + (-Ai"i(gd*At + Ae*))b (^-^(-de* + Ad*At))*b 

(-Ag*A^/|A|V7I)(A^)*c 1 - ((V7^- |A|)g*A^/5VM)*(A^)*c 



|A*| 



-A.g/|A|VM |A|/Vm 
A similar calculation yields (|B13p . To obtain (|B16p we compute 



(A- A)|Ap + A^ 

a*|A|Vm 



(B21) 



|At I I + (A*)-i (-/-Mr + g(5-^(-e* + A/-ir) - (1/ V57)f^ 

l + (.g-iA/-i-(l/V^))f*(A^)*c 
= |A*|(1 + ((-(1/^5/)) + <?-V-')f*(A^)*c) 

= |At| [l + Cg-i/-' - (l/v^))g*A^f 



(B22) 
□ 



When A and Ai are nonsingular, (|B5p is the Sherman-Morrisson Formula (see e.g. jU). 
For our application we need the singular vectors in A^ , rather than A^ itself, but then only 
when rank{k'^) < 3. These singular vectors can be extracted by a simple modification of 
the power method with deflation. 
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